Skip to content

Conversation

@djsaunde
Copy link
Member

@djsaunde djsaunde commented Aug 30, 2025

Description

Title. Move to uv over pip commands for installing from PyPI (uv add) and from source (uv sync), update Docker images and CI to use uv everywhere.

torch is now installed as part of the regular installation process. However, uv will respect the pre-installed torch version in the current environment (so long as it's within the pyproject.toml bounds.

I've also made flash-attn and deepspeed platform-dependent, non-optional dependencies, since it seems we always recommend users to install (and seems like we only did this for macos incompatibility reasons?), but of course this is up for debate.

Motivation and Context

  • uv is quite fast --> translates to users' install times, CI build times
  • uv has nice features (better resolution -> fewer dependency conflicts, torch autoselect (technically no need to preinstall torch unless you want an older supported version), automatic wheel selection, deterministic resolution)
  • multiple requirements.txt -> pyproject.toml
  • setuptools_scm auto-versioning 😌
  • auto-gptq -> gptqmodel (caused uv locking issues)

How has this been tested?

  • manually
  • CI (in progress)
  • docker image builds (in progress)
  • benchmarks
  • auto-versioning

Tests CI comparison

pip:
image

uv:
image

Source install comparison

System info:

  • OS: Ubuntu 24.04.1 LTS
  • CPU: AMD Ryzen 7 3700X 8-Core Processor (16 threads)
  • Memory: 16GB RAM

Warm start

To compare apples to apples, I started with no torch installed and benchmarked the from-source install runtimes for both main (pip install -e . ...) and this branch (uv sync).

pip: 2m 45s (script)
uv: 2s (script)

Thanks to pre-resolving dependencies (via uv.lock), uv is super fast.

The real-world runtime is not quite right: in both cases, the needed packages are already cached locally so there's a warm start.

Cold start

Note that this benchmark will depends on my download speed (peak ~500MiB/s).

pip: 4m 34s (script)
uv: 1m 24s (script)

pip is ~3.25x slower than uv.

In real-world use, I expect the average user will have some packages cached, but not all, so the runtimes should be somewhere between the warm and cold start cases. Anyways, we still save ~2-3m in either case.

Summary by CodeRabbit

  • Documentation
    • Updated installation guides to recommend uv with quick-start and advanced paths; simplified README requirements and examples; integration docs now use uv commands.
  • Chores
    • Migrated packaging to pyproject.toml with explicit dependencies; removed setup.py and legacy requirements files.
    • Switched CI and Docker workflows to uv-based installs; base images/tags now -uv; extended CUDA arch list.
    • Adjusted manifests and coverage config to align with new packaging.
  • Tests
    • Updated CI triggers and nightly/e2e pipelines; modal invocations now use -m filtering.

@djsaunde djsaunde self-assigned this Aug 30, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 30, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Migration to uv-based package management across CI, Docker, scripts, and docs. Packaging moves from setup.py/requirements*.txt to explicit pyproject.toml with src layout. Multiple GitHub workflows and Dockerfiles updated to uv, base image tags adjusted, dynamic dependency tooling removed, and documentation updated accordingly.

Changes

Cohort / File(s) Summary of changes
Packaging modernization
pyproject.toml, setup.py, MANIFEST.in, requirements*.txt, src/setuptools_axolotl_dynamic_dependencies.py
Introduced explicit project metadata, dependencies, optional extras, tooling configs in pyproject.toml; removed setup.py and dynamic dependency module; switched manifest to include pyproject.toml; emptied/removed requirements files.
CI workflows to uv
.github/workflows/* (e.g., docs.yml, lint.yml, multi-gpu-e2e.yml, precommit-autoupdate.yml, preview-docs.yml, pypi.yml, tests-nightly.yml, tests.yml)
Added uv setup steps; replaced pip installs with uv pip --system; adjusted triggers to pyproject.toml; updated BASE_TAG to include -uv; modified modal invocations (-m); removed pip cache/upgrade steps; updated nightly dependency patching to pyproject.toml.
Docker and CI templates
docker/Dockerfile, cicd/Dockerfile.jinja, cicd/Dockerfile-uv.jinja, .runpod/Dockerfile, cicd/multigpu.py, cicd/single_gpu.py
Switched base images to -uv variants; unified on uv for installs; updated CUDA arch list; simplified extras (removed deepspeed/flash-attn from default lists); default BASE_TAGs now include -uv; deleted UV-specific Dockerfile template.
Install scripts to uv
scripts/unsloth_install.py, scripts/cutcrossentropy_install.py
Removed optional UV flags; now always emit uv pip --system uninstall/install commands.
Documentation updates
README.md, docs/installation.qmd, src/axolotl/integrations/cut_cross_entropy/README.md, .github/CONTRIBUTING.md
Added uv-first installation paths; revised commands to uv; adjusted requirements text; updated contributor install instructions to use uv editable dev install.
Coverage config
.coveragerc
Removed setup.py from omit list (setup.py is now deleted).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested reviewers

  • winglian
  • NanoCode012
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch uv-first

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@djsaunde djsaunde linked an issue Aug 30, 2025 that may be closed by this pull request
@djsaunde djsaunde marked this pull request as ready for review August 30, 2025 14:43
@djsaunde djsaunde marked this pull request as draft August 30, 2025 14:43
@github-actions
Copy link
Contributor

github-actions bot commented Aug 30, 2025

📖 Documentation Preview: https://68e147d3d626edbbb2feb89e--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit ffb307a

@codecov
Copy link

codecov bot commented Aug 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
scripts/unsloth_install.py (2)

10-16: Handle CPU-only installs gracefully.

torch.version.cuda can be None and cuda.get_device_capability() can raise on CPU; current code raises and breaks CI unnecessarily.

Apply:

+import sys
@@
-cuda = str(torch.version.cuda)
+cuda_raw = torch.version.cuda
+if not torch.cuda.is_available() or cuda_raw is None:
+    print('echo "CUDA not available; skipping unsloth install"')
+    sys.exit(0)
+cuda = str(cuda_raw)
@@
-try:
-    is_ampere = torch.cuda.get_device_capability()[0] >= 8
-except RuntimeError:
-    is_ampere = False
+try:
+    is_ampere = torch.cuda.get_device_capability()[0] >= 8
+except Exception:
+    is_ampere = False
@@
-if cuda != "12.1" and cuda != "11.8" and cuda != "12.4":
-    raise RuntimeError(f"CUDA = {cuda} not supported!")
+SUPPORTED_CUDA = {"11.8", "12.1", "12.4", "12.6", "12.8"}
+if cuda not in SUPPORTED_CUDA:
+    print(f'echo "CUDA={cuda} not supported by unsloth; skipping"')
+    sys.exit(0)

17-33: Don’t hard-fail on newer Torch in CI—skip with a message.

Your matrix includes 2.6–2.8; this script raises “too new!” and (combined with piping) may silently skip installs later.

Apply:

@@
-else:
-    raise RuntimeError(f"Torch = {v} too new!")
+else:
+    print(f'echo "unsloth not available for torch {v}; skipping"')
+    sys.exit(0)
docs/installation.qmd (1)

60-64: Remove “source $HOME/.local/bin/env” (non-standard).

This file typically doesn’t exist; users will see “No such file or directory.”

Apply:

 curl -LsSf https://astral.sh/uv/install.sh | sh
-source $HOME/.local/bin/env
+export PATH="$HOME/.local/bin:$PATH"
.github/workflows/multi-gpu-e2e.yml (1)

66-74: matrix.axolotl_args is undefined.

actionlint flags this; it will fail at expression time.

Either add a default to each include or stop echoing it. Example (add empty default):

       matrix:
         include:
           - cuda: 126
             cuda_version: 12.6.3
             python_version: "3.11"
             pytorch: 2.6.0
+            axolotl_args: ""
             axolotl_extras:
             num_gpus: 2
             nightly_build: "true"
           - cuda: 126
             cuda_version: 12.6.3
             python_version: "3.11"
             pytorch: 2.7.1
+            axolotl_args: ""
             axolotl_extras: vllm
             num_gpus: 2
             nightly_build: "true"
           - cuda: 128
             cuda_version: 12.8.1
             python_version: "3.11"
             pytorch: 2.8.0
+            axolotl_args: ""
             axolotl_extras:
             num_gpus: 2
             nightly_build: "true"
.github/workflows/tests.yml (3)

251-259: matrix.axolotl_args is undefined here as well.

Same issue as multi-gpu-e2e; actionlint flags it.

Either define it in the matrix includes or remove the echo:

-          echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV

314-323: matrix.axolotl_args undefined (repeat).

Clean it up as above.

Apply the same diff removing the AXOLOTL_ARGS echo or define the matrix field.


358-367: matrix.axolotl_args undefined (repeat).

Clean it up as above.

Apply the same diff removing the AXOLOTL_ARGS echo or define the matrix field.

.github/workflows/tests-nightly.yml (2)

126-133: Matrix variable axolotl_args is undefined (actionlint error).
This will fail at expression evaluation.

Fix by providing a default or by adding it to the matrix:

-echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV
+echo "AXOLOTL_ARGS=${{ matrix.axolotl_args || '' }}" >> $GITHUB_ENV

Or define axolotl_args in the matrix include entries.


171-178: Repeat: axolotl_args undefined here too.
Same actionlint finding applies.

-echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV
+echo "AXOLOTL_ARGS=${{ matrix.axolotl_args || '' }}" >> $GITHUB_ENV
🧹 Nitpick comments (29)
.github/CONTRIBUTING.md (1)

34-35: Prefer uv sync and document uv installation.

  • Use uv’s resolver/lock with a managed venv for contributors; fall back to --system only when needed.
  • Add a quick note/link on installing uv so newcomers aren’t blocked.
-uv pip install -e .[dev]
-
-pre-commit install
-# test
-pytest tests/
+## Option A (recommended): managed virtualenv
+uv sync --all-extras --dev
+
+## Option B: install into system Python (CI/containers)
+# uv pip install --system -e .[dev]
+
+pre-commit install
+# test
+pytest -q
.github/workflows/preview-docs.yml (2)

43-47: Pin uv action input for stability.

“latest” can introduce surprise CI breakages. Either rely on the pinned action major (v4) with a minimum uv version or pin a known-good.

-uses: astral-sh/setup-uv@v4
-with:
-  version: "latest"
+uses: astral-sh/setup-uv@v4
+with:
+  version: ">=0.4.28"

50-51: Consider uvx for tool invocations to avoid system installs.

Keeps the runner cleaner and uses the resolved versions.

-uv pip install --system jupyter quartodoc
-uv pip install --system -e .
+uv pip install --system -e .
+uvx pip install jupyter quartodoc  # or keep as-is; alternatively:
+# uvx quartodoc build  (see next steps)

Follow-up (optional):

- run: quartodoc build
+ run: uvx quartodoc build
.github/workflows/lint.yml (1)

8-13: Trigger lint on uv.lock changes too.

If config/deps shift, lint hooks (e.g., import-order, pyproject-based tooling) can change; include uv.lock in paths.

        - '**.py'
        - 'pyproject.toml'
+       - 'uv.lock'
        - '.github/workflows/*.yml'
        - "*.[q]md"
        - "examples/**/*.y[a]?ml"
        - ".pre-commit-config.yaml"
.github/workflows/precommit-autoupdate.yml (2)

21-25: Pin uv input for reproducible CI.

-uses: astral-sh/setup-uv@v4
-with:
-  version: "latest"
+uses: astral-sh/setup-uv@v4
+with:
+  version: ">=0.4.28"

29-29: Use uvx to run pre-commit without installing it system-wide.

Simpler and avoids polluting the runner image.

-uv pip install --system pre-commit
-pre-commit autoupdate
+uvx pre-commit autoupdate
.github/workflows/docs.yml (2)

23-26: Pin uv instead of “latest” for reproducible CI

Avoid surprise breakages from upstream. Pin a specific uv version (and bump intentionally).

Example:

-          with:
-            version: "latest"
+          with:
+            version: "0.X.Y"  # pin and bump deliberately

29-30: Use uv sync and the lockfile; avoid ad-hoc installs for deterministic docs builds

Direct “uv pip install …” ignores uv.lock. Prefer syncing the project (optionally with a docs extra) and running quartodoc via the project env.

-            uv pip install --system jupyter quartodoc
-            uv pip install --system -e .
+            uv sync --frozen

Also update the build step to use the project env:

-        - name: Build autodoc
-          run: quartodoc build
+        - name: Build autodoc
+          run: uv run quartodoc build
.runpod/Dockerfile (1)

4-5: Harden uv install and reduce image size

  • Pin uv version in the installer (supply-chain hygiene).
  • Prune uv cache after install to trim the image.
  • If an “-uv” base image exists for this stack, prefer it to avoid curl|sh at build time.
-RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
-    /root/.local/bin/uv pip install --system -r /requirements.txt
+RUN curl -LsSf https://astral.sh/uv/install.sh | sh -s -- -v 0.X.Y && \
+    /root/.local/bin/uv pip install --system -r /requirements.txt && \
+    /root/.local/bin/uv cache prune -a

If you choose to pin via an ARG:

# add near top (outside the changed hunk)
ARG UV_VERSION=0.X.Y

And then:

-curl -LsSf https://astral.sh/uv/install.sh | sh -s -- -v 0.X.Y
+curl -LsSf https://astral.sh/uv/install.sh | sh -s -- -v ${UV_VERSION}
scripts/cutcrossentropy_install.py (1)

24-29: Align messaging and flags with the repo’s uv-first posture

  • The ImportError suggests “pip install torch”. Recommend updating to uv for consistency.
  • Optional: include “--system” in the uninstall line for symmetry (install already uses it).

Proposed tweak (outside this hunk):

-except ImportError as exc:
-    raise ImportError("Install torch via `pip install torch`") from exc
+except ImportError as exc:
+    raise ImportError("Install torch via `uv add torch` (project) or `uv pip install --system torch`") from exc

If you want symmetry on uninstall (not strictly required):

-        UNINSTALL_PREFIX = "uv pip uninstall -y cut-cross-entropy && "
+        UNINSTALL_PREFIX = "uv pip uninstall --system -y cut-cross-entropy && "
src/axolotl/integrations/cut_cross_entropy/README.md (1)

20-23: Add “--system” to manual install to mirror scripts and avoid env ambiguity

This matches the printed command from scripts/cutcrossentropy_install.py and prevents accidental venv creation/mismatch.

-uv pip uninstall -y cut-cross-entropy && uv pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@c6a32c5"
+uv pip uninstall --system -y cut-cross-entropy && uv pip install --system "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@c6a32c5"
.github/workflows/pypi.yml (2)

41-45: Pin the uv installer action and/or uv version.

Using “latest” can break releases if upstream changes.

Example:

-        with:
-          version: "latest"
+        with:
+          version: "0.5.x"

46-50: Avoid installing dev extras on the publish job.

They’re unnecessary for building; they slow the workflow and can pull test-only deps.

Lean build:

-      - name: Install dependencies
-        run: |
-          uv pip install --system wheel packaging==23.2
-          uv pip install --system --no-build-isolation -e ".[dev]"
+      - name: Install build tooling
+        run: |
+          uv pip install --system wheel packaging==23.2 build
@@
-      - name: Build package
-        run: |
-          uv pip install --system build
-          python -m build
+      - name: Build package
+        run: python -m build

Also applies to: 55-59

scripts/unsloth_install.py (1)

6-6: Update the guidance to uv-first.

The ImportError message still instructs “pip install torch”.

Use:

-    raise ImportError("Install torch via `pip install torch`") from error
+    raise ImportError("Install PyTorch first (e.g., `uv pip install torch` or follow https://pytorch.org/get-started/locally/)") from error
docs/installation.qmd (3)

22-30: Clarify Torch install guidance now that uv can install it automatically.

This callout conflicts with uv-first messaging.

Proposed text:

  • “uv will install the appropriate PyTorch build automatically. To pin a specific CUDA build, set UV_TORCH_BACKEND (see Advanced uv Installation) or follow PyTorch’s official instructions.”

32-43: Add optional CUDA backend pin to quick start.

Helpful for GPU users to get the right wheel on first try.

Example:

 For a quick installation with uv:
 
 ```{.bash}
 # Install uv if not already installed
 curl -LsSf https://astral.sh/uv/install.sh | sh
+export UV_TORCH_BACKEND=cu126  # or cu128 / cpu
@@
 uv pip install --no-build-isolation axolotl

---

`74-81`: **Unnecessary bootstrap deps.**

packaging/setuptools/wheel aren’t needed before installing torch with uv.

You can drop the first line:

```diff
-uv pip install packaging setuptools wheel
 uv pip install torch==2.6.0
 uv pip install awscli pydantic
.github/workflows/multi-gpu-e2e.yml (1)

57-61: Pin the uv installer action and/or uv version.

Avoid “latest” drift in CI.

Same suggestion as in pypi.yml.

.github/workflows/tests.yml (2)

80-81: Pin torchvision to the matching torch series.

Mixing major/minor can cause ABI/runtime issues.

Example:

-          uv pip install --system torch==${{ matrix.pytorch_version }} torchvision
+          uv pip install --system torch==${{ matrix.pytorch_version }} "torchvision==${{ matrix.pytorch_version }}.*"

242-246: Pin the uv installer action and/or uv version.

Avoid breakage from upstream “latest”.

Same suggestion as in other workflows.

Also applies to: 305-309, 350-354

.github/workflows/tests-nightly.yml (2)

55-62: sed-based pyproject.toml edits are brittle.
If the dependency line changes (different quoting, specifier, extras), sed may miss or overmatch.

Use a small Python TOML edit to be robust:

- sed -i 's#"transformers==.*"#"transformers @ git+https://github.com/huggingface/transformers.git@main"#' pyproject.toml
+ python - <<'PY'
+ import tomllib, tomli_w, re
+ p="pyproject.toml"
+ data=tomllib.loads(open(p,"rb").read())
+ deps=data["project"]["dependencies"]
+ def repl(name,url):
+   for i,s in enumerate(deps):
+     if re.fullmatch(fr'"?{name}==[^"]+"?', s) or s.startswith(f'{name}=='):
+       deps[i]=f'{name} @ {url}'
+ repl("transformers","git+https://github.com/huggingface/transformers.git@main")
+ repl("peft","git+https://github.com/huggingface/peft.git@main")
+ repl("accelerate","git+https://github.com/huggingface/accelerate.git@main")
+ repl("trl","git+https://github.com/huggingface/trl.git@main")
+ repl("datasets","git+https://github.com/huggingface/datasets.git@main")
+ open(p,"wb").write(tomli_w.dumps(data).encode())
+ PY

181-181: Align modal invocation style with earlier step.
Use -m here as well if you intend module mode.

-modal run cicd.multigpu
+modal run -m cicd.multigpu
pyproject.toml (2)

108-147: Heavy defaults; consider slimming base deps.
Gradio, lm_eval, and other tools inflate base install. If not essential at runtime, move them into extras to reduce install surface and solver pressure.

Happy to propose a split if desired.


162-188: Duplicate “dev” dependency declarations (project extras vs tool.uv).
Maintaining both risks drift.

Pick one source of truth (recommend: keep [project.optional-dependencies].dev) and remove [tool.uv].dev-dependencies, then install with -e ".[dev]".

-[tool.uv]
-dev-dependencies = [
-    "pytest",
-    "pytest-cov",
-    "pytest-xdist",
-    "pre-commit",
-    "ruff",
-    "mypy",
-]
+[tool.uv]
+# dev-dependencies intentionally omitted; use project extra: uv pip install -e ".[dev]"

Also applies to: 255-263

docker/Dockerfile (1)

31-31: Pin pytest under uv too (system).
Keep install target consistent.

-uv pip install pytest
+uv pip install --system pytest
cicd/Dockerfile.jinja (4)

1-1: Fix hadolint parser error (DL1000) on templated FROM

Hadolint can’t parse Jinja in FROM, causing “unexpected 'B'”. Either ignore DL1000 or switch to ARG+parameter expansion so the file remains valid Docker syntax.

Apply one of the following:

Option A (ignore):

+## hadolint ignore=DL1000
 FROM axolotlai/axolotl-base-uv:{{ BASE_TAG }}

Option B (Docker-parseable + templated default):

- FROM axolotlai/axolotl-base-uv:{{ BASE_TAG }}
+ARG BASE_TAG="{{ BASE_TAG }}"
+FROM axolotlai/axolotl-base-uv:${BASE_TAG}

13-15: Reduce image size: clean apt lists in same layer

Apt cache isn’t cleaned; this bloats the image.

 RUN apt-get update && \
-    apt-get install -y --allow-change-held-packages vim curl nano libnccl2 libnccl-dev ibverbs-providers ibverbs-utils infiniband-diags librdmacm-dev librdmacm1 rdmacm-utils slurm-wlm
+    apt-get install -y --allow-change-held-packages vim curl nano libnccl2 libnccl-dev ibverbs-providers ibverbs-utils infiniband-diags librdmacm-dev librdmacm1 rdmacm-utils slurm-wlm && \
+    rm -rf /var/lib/apt/lists/*

45-45: Align dev install with lock and avoid duplicate editable installs

This performs a second editable install, re-resolving deps and bypassing uv.lock.

- RUN uv pip install -e .[dev]
+ # Deterministic dev deps; reuse lock
+ RUN uv sync --frozen -E dev

If you keep pip form, add --system:

- RUN uv pip install -e .[dev]
+ RUN uv pip install --system -e ".[dev]"

34-34: Install with --system to avoid stray venvs
In Docker, prefer --system to install directly into the image’s Python and prevent accidental .venv creation.

-RUN uv pip install packaging==23.2 setuptools==75.8.0
+RUN uv pip install --system packaging==23.2 setuptools==75.8.0
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 0094a2d and 6c32d4c.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (28)
  • .coveragerc (0 hunks)
  • .github/CONTRIBUTING.md (1 hunks)
  • .github/workflows/docs.yml (1 hunks)
  • .github/workflows/lint.yml (1 hunks)
  • .github/workflows/multi-gpu-e2e.yml (2 hunks)
  • .github/workflows/precommit-autoupdate.yml (1 hunks)
  • .github/workflows/preview-docs.yml (1 hunks)
  • .github/workflows/pypi.yml (1 hunks)
  • .github/workflows/tests-nightly.yml (4 hunks)
  • .github/workflows/tests.yml (6 hunks)
  • .runpod/Dockerfile (1 hunks)
  • MANIFEST.in (1 hunks)
  • README.md (1 hunks)
  • cicd/Dockerfile-uv.jinja (0 hunks)
  • cicd/Dockerfile.jinja (2 hunks)
  • cicd/multigpu.py (1 hunks)
  • cicd/single_gpu.py (1 hunks)
  • docker/Dockerfile (2 hunks)
  • docs/installation.qmd (3 hunks)
  • pyproject.toml (3 hunks)
  • requirements-dev.txt (0 hunks)
  • requirements-tests.txt (0 hunks)
  • requirements.txt (0 hunks)
  • scripts/cutcrossentropy_install.py (1 hunks)
  • scripts/unsloth_install.py (1 hunks)
  • setup.py (0 hunks)
  • src/axolotl/integrations/cut_cross_entropy/README.md (1 hunks)
  • src/setuptools_axolotl_dynamic_dependencies.py (0 hunks)
💤 Files with no reviewable changes (7)
  • .coveragerc
  • requirements-dev.txt
  • requirements-tests.txt
  • cicd/Dockerfile-uv.jinja
  • src/setuptools_axolotl_dynamic_dependencies.py
  • setup.py
  • requirements.txt
🧰 Additional context used
🪛 actionlint (1.7.7)
.github/workflows/pypi.yml

53-53: workflow command "set-output" was deprecated. use echo "{name}={value}" >> $GITHUB_OUTPUT instead: https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions

(deprecated-commands)

.github/workflows/tests.yml

250-250: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; dockerfile: string; num_gpus: number; python_version: number; pytorch: string}

(expression)


313-313: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; num_gpus: number; python_version: number; pytorch: string}

(expression)


358-358: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; num_gpus: number; python_version: number; pytorch: string}

(expression)

.github/workflows/multi-gpu-e2e.yml

65-65: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; nightly_build: bool; num_gpus: number; python_version: number; pytorch: string}

(expression)

.github/workflows/tests-nightly.yml

125-125: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; nightly_build: bool; num_gpus: number; python_version: number; pytorch: string}

(expression)


170-170: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; nightly_build: bool; num_gpus: number; python_version: number; pytorch: string}

(expression)

🪛 Hadolint (2.12.0)
cicd/Dockerfile.jinja

[error] 1-1: unexpected 'B'
expecting '#', '', ADD, ARG, CMD, COPY, ENTRYPOINT, ENV, EXPOSE, FROM, HEALTHCHECK, LABEL, MAINTAINER, ONBUILD, RUN, SHELL, STOPSIGNAL, USER, VOLUME, WORKDIR, a pragma, at least one space, or end of input

(DL1000)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: preview
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
🔇 Additional comments (12)
scripts/unsloth_install.py (1)

35-35: Use a single line, but keep it safe if earlier guards exit.

The combined command is fine given the new early exits; LGTM once guards are added.

.github/workflows/tests-nightly.yml (4)

46-49: Good move switching to uv in CI.
This consolidates tooling and should speed up env setup.


53-53: Pin torchvision to the torch matrix to avoid ABI/API mismatches.
torchvision often requires a torch-matched build; installing unpinned next to a pinned torch can break imports at runtime.

Suggested tweak:

-uv pip install --system torch==${{ matrix.pytorch_version }} torchvision
+uv pip install --system "torch==${{ matrix.pytorch_version }}" "torchvision==${{ matrix.pytorch_version }}"

65-67: Nice: uv install respects preinstalled torch and editable dev extra.
This aligns with the PR’s “uv-first” goal.


136-137: Use -m consistently for modal runs or drop it everywhere.
Single-GPU uses -m; multi-GPU below does not. Prefer consistency to avoid ambiguity between file vs module mode.

-modal run -m cicd.e2e_tests
+modal run -m cicd.e2e_tests

(If you choose not to use -m, remove it in both places.)

pyproject.toml (2)

96-106: Verify availability of platform-marked deps.
Ensure flash-attn, deepspeed, deepspeed-kernels resolve from your configured indexes on CI runners; otherwise lock or provide extra find-links.

I can script-check PyPI availability if useful.


144-144: Confirm correct optimizer package
The dependency [email protected] is a valid PyPI package; verify this is the intended library. If you meant to use torch-optimizer, replace it with torch-optimizer==0.3.0 (or your desired version).

cicd/multigpu.py (1)

26-26: Default BASE_TAG updated to -uv variant — good alignment.
Matches the uv-first Docker base images.

cicd/single_gpu.py (1)

26-26: Default BASE_TAG updated to -uv variant — LGTM.
Keeps single-GPU path consistent with multi-GPU and Dockerfile.

docker/Dockerfile (2)

1-2: Switch to -uv base — LGTM.
Consistent with the CI and Modal changes.


25-28: Install to system site-packages for clarity.
Without --system, uv may create/target a venv; installing to system avoids PATH surprises in downstream shells/entrypoints.

-uv pip install --no-build-isolation -e .[ring-flash-attn,optimizers,ray,$AXOLOTL_EXTRAS] $AXOLOTL_ARGS
+uv pip install --system --no-build-isolation -e .[ring-flash-attn,optimizers,ray,$AXOLOTL_EXTRAS] $AXOLOTL_ARGS
@@
-uv pip install --no-build-isolation -e .[ring-flash-attn,optimizers,ray] $AXOLOTL_ARGS
+uv pip install --system --no-build-isolation -e .[ring-flash-attn,optimizers,ray] $AXOLOTL_ARGS

Likely an incorrect or invalid review comment.

cicd/Dockerfile.jinja (1)

3-3: Confirm CUDA arch list and PTX fallback

New value adds 9.0+PTX and drops “+PTX” from 8.6. Verify target GPUs (e.g., H100, A100) and that PTX fallback choice is intentional to balance binary size vs compatibility.

@djsaunde djsaunde force-pushed the uv-first branch 2 times, most recently from 17b1baf to 3a4295c Compare September 15, 2025 21:13
@djsaunde djsaunde marked this pull request as ready for review September 16, 2025 19:45
@djsaunde
Copy link
Member Author

Opening early for CI.

@djsaunde djsaunde marked this pull request as ready for review October 4, 2025 13:09
@djsaunde
Copy link
Member Author

djsaunde commented Oct 4, 2025

This is pretty much good to go, save for migrating all the docker images to build on the uv base image. The old base images should be dropped (I think?). Frankly, I'm a bit bad at docker so hopefully I didn't break anything 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

uv-first

1 participant